Using traits of web macro scripts to predict reuse

Authors

  • Christopher Scaffidi
  • Christopher Bogart
  • Margaret M. Burnett
  • Allen Cypher
  • Brad A. Myers
  • Mary Shaw
Abstract

To help people find code that they might want to reuse, repositories of end-user code typically sort scripts by number of downloads, ratings, or other information based on prior uses of the code. However, this information is unavailable when code is new or when it has not yet been reused. Addressing this problem requires identifying reusable code based solely on information that exists when a script is created. To provide such a model for web macro scripts, we identified script traits that might plausibly predict reuse, then used IBM CoScripter repository logs to statistically test how well each corresponded to actual reuse. These tests confirmed that the traits generally did correspond to higher levels of reuse as anticipated. We then developed a machine learning model that uses these traits as features to predict reuse of macros. Evaluating this model on repository logs showed that its accuracy is comparable to that of existing machine learning models for predicting reuse, but with a much simpler structure that lends itself to automatically generating explanations of predictions. Further tests revealed that the difference-of-proportions metric used internally by our model to incorporate information from features is more accurate than the traditional entropy-based metric used in most existing models. Sensitivity analysis revealed that our model is quite robust; its quality is greatly reduced only when parameters are set to such extreme values that the model becomes inordinately selective. Testing the model with individual traits showed that the most useful traits were related to author expertise and indicators of mass appeal. Based on these results, we outline opportunities for using our model to improve repositories of end-user code.
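The abstract contrasts a difference-of-proportions metric with an entropy-based one but does not give formulas. The sketch below is only an illustration under stated assumptions: it takes "difference of proportions" to mean the reuse rate among scripts that have a trait minus the reuse rate among scripts that lack it, and uses textbook information gain as the entropy-based counterpart. The function names and the toy data are hypothetical and are not taken from the paper or from CoScripter logs.

```python
from math import log2


def split_by_trait(has_trait, reused):
    """Split the reuse labels into scripts with and without the trait."""
    with_trait = [r for t, r in zip(has_trait, reused) if t]
    without_trait = [r for t, r in zip(has_trait, reused) if not t]
    return with_trait, without_trait


def difference_of_proportions(has_trait, reused):
    """Assumed definition: P(reused | trait) - P(reused | no trait)."""
    with_trait, without_trait = split_by_trait(has_trait, reused)
    if not with_trait or not without_trait:
        return 0.0  # the trait does not split the data, so it carries no signal
    return (sum(with_trait) / len(with_trait)
            - sum(without_trait) / len(without_trait))


def entropy(labels):
    """Binary Shannon entropy (in bits) of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * log2(p) + (1 - p) * log2(1 - p))


def information_gain(has_trait, reused):
    """Textbook entropy-based score: H(reused) - H(reused | trait)."""
    n = len(reused)
    if n == 0:
        return 0.0
    with_trait, without_trait = split_by_trait(has_trait, reused)
    remainder = (len(with_trait) / n * entropy(with_trait)
                 + len(without_trait) / n * entropy(without_trait))
    return entropy(reused) - remainder


# Hypothetical toy data: 1 = trait present / script was reused, 0 = otherwise.
toy_has_trait = [1, 1, 1, 1, 0, 0, 0, 0]
toy_reused = [1, 1, 1, 0, 0, 0, 1, 0]

dop_score = difference_of_proportions(toy_has_trait, toy_reused)
gain_score = information_gain(toy_has_trait, toy_reused)
```

One property visible in this sketch is that the difference of proportions is signed and reads directly as a change in reuse rate, which may be part of why a model built on it lends itself to generated explanations; that interpretation is a guess rather than a claim from the abstract.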

Similar resources

Reuse in the world of end-user programmers

Authors: Christopher Scaffidi, Mary Shaw. Abstract: End-user programmers often reuse one another’s code when creating new programs, but this reuse is rarely as clean or simple as the blackbox reuse that professional programmers aspire to achieve. In this chapter, we explore the motivations that drive reuse of end-user code, th...

Evaluation of cultivated and wild barley cultivars affinities using micro and macro-morphological traits of grain, pollen, and stomata

Authors: Sayyedh Masomeh Hosseini, Mahlagh Ghorbanli, Hossein Sabouri. Abstract: In ongoing research, 24 cultivated and wild cultivars of barley were evaluated for morphological characteristics of grain, pollen, and stomata. Traits of interest included length, width, and area. Results of variance analysis showed that all samples differed in traits of stomata, grain, and pollen at probability levels of 1 and 5%, suggesting remarkable genetic variation among studied s...

Robust trait composition for Javascript

We introduce traits.js, a small, portable trait composition library for Javascript. Traits are a more robust alternative to multiple inheritance and enable object composition and reuse. traits.js is motivated by two goals: first, it is an experiment in using and extending Javascript’s recently added meta-level object description format. By reusing this standard description format, traits.js can...

Digging for diamonds: Identifying valuable end-user code in repositories

To a large extent, repositories of end-user code are “write-only”: much of the code that people publish never sees substantial reuse. Yet buried within these repositories are valuable pieces of code, though finding them is not always easy. In prior work, we developed a model that can predict, when a web macro is created, whether that script will be reused by anybody. In the current paper, we an...

Journal:
  • J. Vis. Lang. Comput.

Volume: 21

Publication date: 2010